Skip to content

Conversation

@aemous
Copy link
Contributor

@aemous aemous commented Oct 21, 2025

Description of changes:

Description of tests:

  • Wrote and passed unit tests asserting expected behavior of the primary components and interfaces.
  • Thoroughly tested the tool manually and verified expected behavior.

Example output:

$ upgrade-aws-cli --script examples/upload_s3_files.sh -i --output upload_s3_files_v2.sh

[1/2] binary-params-base64
In AWS CLI v2, an input parameter typed as binary large object (BLOB) expects the input to be base64-encoded. To retain v1 behavior after upgrading to AWS CLI v2, add `--cli-binary-format raw-in-base64-out`.

@@ -1,8 +1,8 @@
#!/bin/bash
# Example script with AWS CLI v1 patterns

-aws secretsmanager put-secret-value --secret-id secret1213 \
-  --secret-binary file://data.json
+aws secretsmanager put-secret-value --secret-id secret1213 \
+  --secret-binary file://data.json --cli-binary-format raw-in-base64-out                                                                                                                                        

if
 aws kinesis put-record --stream-name samplestream --data file://data --partition-key samplepartitionkey ; then

Apply this fix? [y]es, [n]o, [u]pdate all, [q]uit: y

[2/2] binary-params-base64
In AWS CLI v2, an input parameter typed as binary large object (BLOB) expects the input to be base64-encoded. To retain v1 behavior after upgrading to AWS CLI v2, add `--cli-binary-format raw-in-base64-out`.

@@ -5,7 +5,7 @@
  --secret-binary file://data.json

if
- aws kinesis put-record --stream-name samplestream --data file://data --partition-key samplepartitionkey ; then
+ aws kinesis put-record --stream-name samplestream --data file://data --partition-key samplepartitionkey --cli-binary-format raw-in-base64-out ; then
    echo "command succeeded."
fi


Apply this fix? [y]es, [n]o, [u]pdate all, [q]uit: y
Fixed script written to: upload_s3_files_v2.sh

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@aemous aemous requested a review from a team October 21, 2025 15:51
@codecov-commenter
Copy link

codecov-commenter commented Oct 21, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (upgrade-linting-tool@1ef329d). Learn more about missing BASE report.

Additional details and impacted files
@@                   Coverage Diff                   @@
##             upgrade-linting-tool    #9803   +/-   ##
=======================================================
  Coverage                        ?   93.33%           
=======================================================
  Files                           ?      209           
  Lines                           ?    16807           
  Branches                        ?        0           
=======================================================
  Hits                            ?    15687           
  Misses                          ?     1120           
  Partials                        ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@hssyoo hssyoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's looking good, I like the general experience. Left a few questions and suggestions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand the purpose of this Makefile. It looks like it can be largely replaced with modern Python dev tools like uv and ruff (also curious why we're using black and isort here).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair, I'll look into uv and ruff, and potentially delete this Makefile and update the README dev guide

For a full list of the breaking changes introduced with AWS CLI v2, see
[Breaking changes between AWS CLI version 1 and AWS CLI version 2](https://docs.aws.amazon.com/cli/latest/userguide/cliv2-migration-changes.html#cliv2-migration-changes-breaking).

## Installation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the idea was to publish this as a separate PyPI package? Why are we instructing users to build from source?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so this was moreso intended for the devs contributing to this package. I agree most customers would be installing it from PyPi. I can try to make that more clear in the next revision with adding clarifications to this README, let me know if you have additional asks for this

2. Install dependencies:
```bash
pip install -r requirements.txt
pip install -r requirements-dev-lock.txt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we anticipate users to require dev dependencies?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to my last comment, this workflow was intended for dev contributors (i.e. us). The dev dependencies are for PyTest and format packages to test and format the package.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have setup.py in addition to pyproject.toml?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, after some research it seems we only really need one in our case, and that pyproject.toml is the favored way. I will delete setup.py in the next revision

print("Invalid choice. Please enter y, n, u, or q.")


def display_finding(finding: LintFinding, index: int, total: int, script_content: str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something that's confusing is that a line can show up as a diff even if nothing changed. Using the example file in this commit:

-aws secretsmanager put-secret-value --secret-id secret1213 \
-  --secret-binary file://data.json
+aws secretsmanager put-secret-value --secret-id secret1213 \
+  --secret-binary file://data.json --cli-binary-format raw-in-base64-out

Is this intended behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I tried to keep the diff-generation code simple while also giving a somewhat familiar experience to git diff, and I think this behavior is an artifact of that.

We can pursue more complex diffing logic to address this. If we go down this route, I'd likely utilize difflib which is included in the standard Python library. It has a unified_diff function which may produce what we're looking for.

Below is some simple code I tested locally, seems to show the desired behavior:

>>> src_lines = 'aws secretsmanager put-secret-value --secret-id secret1213 \\\n  --secret-binary file://data.json\n'.splitlines(keepends=True)
>>> dest_lines = 'aws secretsmanager put-secret-value --secret-id secret1213 \\\n  --secret-binary file://data.json --cli-binary-format raw-in-base64-out\n'.splitlines(keepends=True)
>>> diff2 = difflib.unified_diff(src_lines, dest_lines)
>>> sys.stdout.writelines(diff2)
--- 
+++ 
@@ -1,2 +1,2 @@
 aws secretsmanager put-secret-value --secret-id secret1213 \
-  --secret-binary file://data.json
+  --secret-binary file://data.json --cli-binary-format raw-in-base64-out

I'm leaning towards going this route, please share feedback on this. It would likely simplify some of our diff-generation-code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that looks great for our use case

Comment on lines +35 to +36
src_lines_removed = end_line - start_line + 1
new_lines_added = len(dest_lines)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding everything correctly, then src_lines_removed should always be equal to new_lines_added, right? If so, should we add validation on that to guard against potentially corrupting the source file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of now, yes I expect them to be equal. By calculating it like this it does not lock us in to that as a requirement.

I can imagine down the line us suggesting a 'fix' that takes up multiple lines, in that case they would not be equal. However, we currently have no plans for any of the linting rules to suggest fixes that take up multiple lines. It'll pretty much always be adding a flag to the last line or modifying the command or argument.

I can add this assertion, and there's always the option to remove it if we ever have suggestions for adding multi-line suggestions. Are you recommending it's added as a runtime check? Or introducing a test for this?

I'm leaning towards a runtime check, and raising RuntimeError if they're not equal, with comments explaining some things in this thread. That'll be my plan for next revision unless you share disagreements

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think we should just stick to line-by-line changes (ie always 1 line changed per 1 line affected). Go ahead and add the runtime check

Comment on lines +92 to +94
elif choice == "q":
print("Cancelled.")
sys.exit(0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be a bit confusing if a user accepts a few fixes and then presses q to exit early with the expectation that the accepted fixes would be applied.

  • We should explicitly document this behavior.
  • [non-blocking suggestion] And also consider adding a [s]ave option that "flushes" accepted fixes and applies them. If you think we should do this, okay with doing it in a followup PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense. Now that I think about this, I think we can let q be the save and exit option that you're suggesting. Users can always use Ctrl+C to terminate the program immediately.

Let me know if you agree with this. In either case, I would like to do it in a followup PR though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm let's discuss this with the team. I can see advantages to different approaches here

def display_finding(finding: LintFinding, index: int, total: int, script_content: str):
"""Display a finding to the user with context."""
src_lines = script_content.splitlines()
dest_lines = finding.edit.inserted_text.splitlines()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To calculate new_lines_added, is there a reason we're not using LintFinding.line_start and LintFinding.line_end?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this relates to a previous reply in another comment. I left open the possibility that there could be more "new lines" than "source lines". line_start and line_end are the line numbers in the source file.

If we end up deciding to be more rigid and force that the number of "new lines" equals the number of "source lines" with an assertion, we may be able use the variables you suggested instead.

With this information, please advise on if you want me to proceed with the more rigid approach, with the assertion that source == new lines.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Echoing my other reply: Yeah let's go ahead and make the assertion

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants